Precision vs Confidence Tradeoffs for ℓ2-Based Frequency Estimation in Data Streams

Author

  • Sumit Ganguly
Abstract

We consider the data stream model, in which an n-dimensional vector x is updated coordinate-wise by a stream of updates. The frequency estimation problem is to process the stream in a single pass, using small memory, so that an estimate of x_i can be retrieved for any i. We present the first algorithms for ℓ2-based frequency estimation that exhibit a tradeoff between the precision (additive error) of an estimate and the confidence in that estimate, for a range of parameter values. We show that our algorithms are optimal for a range of parameters within the class of matrix algorithms, namely, those whose state for a vector x can be represented as Ax for some m × n matrix A. All known algorithms for ℓ2-based frequency estimation are matrix algorithms.
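To make the notion of a matrix algorithm concrete, the following is a minimal sketch of the classical CountSketch, a well-known ℓ2-based frequency estimator. It is not the algorithm of this paper and does not reproduce its precision/confidence tradeoff. The class name, the parameter names (rows, width, seed), and the explicit random tables standing in for hash functions are assumptions made for illustration only.

```python
import random
from statistics import median


class CountSketch:
    """Classical CountSketch (illustrative only, not the paper's algorithm).

    The counter table is a linear function of x, so the state can be
    written as A x for a fixed (rows * width) x n matrix A with one
    +1 or -1 entry per column in each block of `width` rows.
    """

    def __init__(self, n, rows, width, seed=0):
        rng = random.Random(seed)
        self.n, self.rows, self.width = n, rows, width
        # Assumption: explicit random tables replace the limited-independence
        # hash functions a real implementation would use (fine for small n).
        self.bucket = [[rng.randrange(width) for _ in range(n)]
                       for _ in range(rows)]
        self.sign = [[rng.choice((-1, 1)) for _ in range(n)]
                     for _ in range(rows)]
        self.table = [[0] * width for _ in range(rows)]

    def update(self, i, delta):
        """Apply the stream update x[i] += delta (delta may be negative)."""
        for r in range(self.rows):
            self.table[r][self.bucket[r][i]] += self.sign[r][i] * delta

    def estimate(self, i):
        """Return the median over rows of the signed counter for coordinate i."""
        return median(self.sign[r][i] * self.table[r][self.bucket[r][i]]
                      for r in range(self.rows))


# Usage: process a short stream of (coordinate, delta) updates in one pass.
sketch = CountSketch(n=100, rows=5, width=32, seed=42)
for i, delta in [(7, 10), (3, 2), (7, 5), (3, -2)]:
    sketch.update(i, delta)
print(sketch.estimate(7))
```

In the standard analysis of this sketch, width governs precision (additive error on the order of ‖x‖₂/√width per row), while rows governs confidence, since the median fails only if most rows fail. Because every update adds a fixed multiple of delta to fixed counters, the final table does not depend on the order of updates, which is exactly the Ax property described in the abstract.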

Similar resources

Layered space-time equalization for wireless MIMO systems

In this paper we investigate layered space-time equalization (LSTE) architectures for multiple-input-multiple-output (MIMO) frequency selective channels. At each layer or stage of detection, a MIMO delayed decision feedback sequence estimator (MIMO-DDFSE) is used to tentatively detect a group of selected data streams, among which a sub-group of data streams are output and are canceled from the ...

An Efficient RFID Data Cleaning Method Based on Wavelet Density Estimation

A large amount of noise is usually carried in the original RFID data and needs to be cleaned up before further processing. Outlier detection is an effective method for RFID data cleaning. In this paper, a point probability data model was proposed to describe the uncertain RFID data streams. The wavelet density threshold was incorporated in this method to adaptively detect the outliers in the sl...

Coordinate Descent Algorithms for Lasso Penalized Regression

Imposition of a lasso penalty shrinks parameter estimates toward zero and performs continuous model selection. Lasso penalized regression is capable of handling linear regression problems where the number of predictors far exceeds the number of cases. This paper tests two exceptionally fast algorithms for estimating regression coefficients with a lasso penalty. The previously known ℓ2 algorithm...

Efficient Distributed Precision Control in Symmetric Replication Environments

Maintaining strict consistency of replicated data can be prohibitively expensive for many distributed applications and environments. In order to alleviate this problem, some systems allow applications to access stale, imprecise data. Due to relaxed correctness requirements, many applications can tolerate stale data but require that the imprecision be properly bounded. This paper describes ReBou...

Estimating Data Stream Quality for Object-Detection Applications

Object-detection applications rely on streams of data gathered from sensors, RFID readers, and image recognition systems, among others. These raw data streams tend to be noisy, including both false positives (erroneous readings) and false negatives (missed readings). Techniques exist for general-purpose cleaning of these types of data streams, based on temporal and/or spatial correlations, as w...

Publication date: 2012